Selective Labeling via Error Bound Minimization

Authors

  • Quanquan Gu
  • Tong Zhang
  • Chris H. Q. Ding
  • Jiawei Han
Abstract

In many practical machine learning problems, acquiring labeled data is expensive, time consuming, or both. This motivates the following problem: given a label budget, how should data points be selected for labeling so that learning performance is optimized? We propose a selective labeling method based on an analysis of the out-of-sample error of Laplacian Regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select the subset of data points to label by minimizing this upper bound. Since the minimization is a combinatorial problem, we relax it into a continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms the state-of-the-art methods.
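
The relax-then-project strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not state the LapRLS error bound, so the objective below is an assumption: an A-optimal-design-style surrogate, trace((X^T diag(s) X + lam I)^{-1}), standing in for the bound. The function names, the step-size rule (backtracking), and the top-m rounding step are likewise hypothetical choices for illustration, not the authors' algorithm.

```python
import numpy as np


def project_capped_simplex(v, m, iters=60):
    """Euclidean projection of v onto {s : 0 <= s <= 1, sum(s) = m}.

    The projection has the form clip(v - tau, 0, 1); tau is found by bisection.
    """
    lo, hi = v.min() - 1.0, v.max()
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(v - tau, 0.0, 1.0).sum() > m:
            lo = tau  # sum too large: shift further down
        else:
            hi = tau
    return np.clip(v - 0.5 * (lo + hi), 0.0, 1.0)


def objective(s, X, lam):
    # ASSUMPTION: surrogate objective, not the bound derived in the paper.
    M = X.T @ (s[:, None] * X) + lam * np.eye(X.shape[1])
    return np.trace(np.linalg.inv(M))


def gradient(s, X, lam):
    # d/ds_i trace(M^{-1}) = -x_i^T M^{-2} x_i = -||M^{-1} x_i||^2
    M = X.T @ (s[:, None] * X) + lam * np.eye(X.shape[1])
    Z = X @ np.linalg.inv(M)
    return -np.sum(Z * Z, axis=1)


def select_points(X, m, lam=1e-2, step=1.0, n_iter=100):
    """Relax the 0/1 selection vector to [0, 1]^n and run projected gradient descent."""
    n = X.shape[0]
    s = np.full(n, m / n)  # feasible starting point: uniform weights
    f = objective(s, X, lam)
    for _ in range(n_iter):
        g = gradient(s, X, lam)
        t = step
        while t > 1e-12:  # backtracking: halve the step until the objective drops
            s_new = project_capped_simplex(s - t * g, m)
            f_new = objective(s_new, X, lam)
            if f_new < f:
                break
            t *= 0.5
        else:
            break  # no improving step found; stop
        s, f = s_new, f_new
    chosen = np.sort(np.argsort(-s)[:m])  # round: keep the m largest weights
    return chosen, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 4))  # synthetic data, for illustration only
    chosen, s = select_points(X, m=5)
    print("indices selected for labeling:", chosen)
```

The projection step keeps every iterate inside the relaxed feasible set (weights in [0, 1] that sum to the label budget m), and the final rounding maps the continuous solution back to a discrete subset of m points.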

Related resources

Recovery of signals under the condition on RIC and ROC via prior support information

In this paper, the sufficient condition in terms of the RIC and ROC for the stable and robust recovery of signals in both noiseless and noisy settings was established via weighted l1 minimization when there is partial prior information on support of signals. An improved performance guarantee has been derived. We can obtain a less restricted sufficient condition for signal reconstruction and a t...

Batch-Mode Active Learning via Error Bound Minimization

Active learning has been proven to be quite effective in reducing the human labeling efforts by actively selecting the most informative examples to label. In this paper, we present a batch-mode active learning method based on logistic regression. Our key motivation is an out-of-sample bound on the estimation error of class distribution in logistic regression conditioned on any fixed training sa...

A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

A novel linear feature selection algorithm is presented based on the global minimization of a data-dependent generalization error bound. Feature selection and scaling algorithms often lead to non-convex optimization problems, which in many previous approaches were addressed through gradient descent procedures that can only guarantee convergence to a local minimum. We propose an alternative appr...

Discriminative Similarity for Clustering and Semi-Supervised Learning

Similarity-based clustering and semi-supervised learning methods separate the data into clusters or classes according to the pairwise similarity between the data, and the pairwise similarity is crucial for their performance. In this paper, we propose a novel discriminative similarity learning framework which learns discriminative similarity for either data clustering or semi-supervised learning...

Exact Recovery for Sparse Signal via Weighted l_1 Minimization

Numerical experiments in the literature on compressed sensing have indicated that reweighted l1 minimization performs exceptionally well in recovering sparse signals. In this paper, we develop exact recovery conditions and an algorithm for sparse signals via weighted l1 minimization, drawing on the classical NSP (null space property) and RIC (restricted isometry constant) bound. We first int...

Publication date: 2012